Python for Bioinformatics

This Jupyter notebook is intended to be used alongside the book Python for Bioinformatics.

Chapter 6: Code Modularizing


In [3]:
len('Hello')


Out[3]:
5

Listing 6.1: netchargefn: Function to calculate the net charge of a protein


In [3]:
def protcharge(aa_seq):
    """Returns the net charge of a protein sequence"""
    protseq = aa_seq.upper()
    charge = -0.002
    aa_charge = {'C':-.045, 'D':-.999, 'E':-.998, 'H':.091,
                 'K':1, 'R':1, 'Y':-.001}
    for aa in protseq:
        charge += aa_charge.get(aa,0)
    return charge

In [5]:
protcharge('EEARGPLRGKGDQKSAVSQKPRSRGILH')


Out[5]:
4.094
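
The loop in protcharge() relies on dict.get() with a default of 0, so residues without a listed charge add nothing to the total. A minimal sketch of that lookup behavior, reusing the aa_charge table from Listing 6.1:

```python
aa_charge = {'C': -.045, 'D': -.999, 'E': -.998, 'H': .091,
             'K': 1, 'R': 1, 'Y': -.001}

# A residue present in the table returns its charge
lysine = aa_charge.get('K', 0)
# A residue absent from the table (alanine) falls back to the default
alanine = aa_charge.get('A', 0)
print(lysine, alanine)

# Plain indexing would raise KeyError for the same residue
try:
    aa_charge['A']
    raised = False
except KeyError:
    raised = True
print(raised)
```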

In [6]:
protcharge()


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-a30dddd310c8> in <module>()
----> 1 protcharge()

TypeError: protcharge() missing 1 required positional argument: 'aa_seq'

Listing 6.2: netchargefn: Function that returns two values


In [7]:
def charge_and_prop(aa_seq):
    """ Returns the net charge of a protein sequence
    and proportion of charged amino acids
    """
    protseq = aa_seq.upper()
    charge = -0.002
    cp = 0
    aa_charge = {'C':-.045, 'D':-.999, 'E':-.998, 'H':.091,
                 'K':1, 'R':1, 'Y':-.001}
    for aa in protseq:
        charge += aa_charge.get(aa,0)
        if aa in aa_charge:
            cp += 1
    prop = 100.*cp/len(aa_seq)
    return (charge,prop)

In [8]:
charge_and_prop('EEARGPLRGKGDQKSAVSQKPRSRGILH')


Out[8]:
(4.094, 39.285714285714285)

In [9]:
charge_and_prop('EEARGPLRGKGDQKSAVSQKPRSRGILH')[1]


Out[9]:
39.285714285714285
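
Instead of indexing the returned tuple, both values can be unpacked into separate names in one assignment. The sketch below repeats the function from Listing 6.2 so that it runs on its own; the names charge and prop on the calling side are chosen here only for illustration.

```python
def charge_and_prop(aa_seq):
    """Returns the net charge of a protein sequence
    and proportion of charged amino acids
    """
    protseq = aa_seq.upper()
    charge = -0.002
    cp = 0
    aa_charge = {'C': -.045, 'D': -.999, 'E': -.998, 'H': .091,
                 'K': 1, 'R': 1, 'Y': -.001}
    for aa in protseq:
        charge += aa_charge.get(aa, 0)
        if aa in aa_charge:
            cp += 1
    prop = 100. * cp / len(aa_seq)
    return (charge, prop)

# Tuple unpacking: one name per returned value
charge, prop = charge_and_prop('EEARGPLRGKGDQKSAVSQKPRSRGILH')
print('Net charge: {0:.3f}'.format(charge))
print('Charged residues: {0:.1f}%'.format(prop))
```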

Listing 6.3: convertlist.py: Converts a list into a text file


In [11]:
def save_list(input_list, file_name):
    """A list (input_list) is saved in a file (file_name)"""
    with open(file_name, 'w') as fh:
        for item in input_list:
            fh.write('{0}\n'.format(item))
    return None

In [12]:
def duplicate(x):
    y = 1
    print('y = {0}'.format(y))
    return(2*x)

In [13]:
duplicate(5)


y = 1
Out[13]:
10

In [14]:
y


---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-14-009520053b00> in <module>()
----> 1 y

NameError: name 'y' is not defined

In [16]:
def duplicate(x):
    print('y = {0}'.format(y))
    return(2*x)

In [17]:
duplicate(5)


---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-17-1a5f2cffbfa4> in <module>()
----> 1 duplicate(5)

<ipython-input-16-d7e972d2aa53> in duplicate(x)
      1 def duplicate(x):
----> 2     print('y = {0}'.format(y))
      3     return(2*x)

NameError: name 'y' is not defined

In [18]:
y = 3
def duplicate(x):
    print('y = {0}'.format(y))
    return(2*x)

In [19]:
duplicate(5)


y = 3
Out[19]:
10

In [20]:
y = 3
def duplicate(x):
    y = 1
    print('y = {0}'.format(y))
    return(2*x)

In [21]:
duplicate(5)


y = 1
Out[21]:
10
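
The cells above show that the function prints its own y. What they do not show is that the y defined outside the function keeps its value after the call. A short sketch of that check (the name result is introduced here for illustration):

```python
y = 3

def duplicate(x):
    y = 1  # creates a new local name; the outer y is not modified
    return 2 * x

result = duplicate(5)
print(result)  # 10
print(y)       # 3, the assignment inside the function did not leak out
```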

In [22]:
def test(x):
    global z
    z = 10
    print('z = {0}'.format(z))
    return x*2

In [23]:
z = 1
test(4)


z = 10
Out[23]:
8

In [24]:
z


Out[24]:
10

Listing 6.4: list2textdefault.py: Function with a default parameter


In [26]:
def save_list(input_list, file_name='temp.txt'):
    """A list (input_list) is saved in a file (file_name)"""
    with open(file_name, 'w') as fh:
        for item in input_list:
            fh.write('{0}\n'.format(item))
    return None

In [27]:
save_list(['MS233','MS772','MS120','MS93','MS912'])
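
The call above relies on the default file name. The default can be overridden by passing the second argument, either by position or by keyword. A minimal sketch, assuming a hypothetical file name ids.txt placed in a temporary directory so the example does not overwrite anything:

```python
import os
import tempfile

def save_list(input_list, file_name='temp.txt'):
    """A list (input_list) is saved in a file (file_name)"""
    with open(file_name, 'w') as fh:
        for item in input_list:
            fh.write('{0}\n'.format(item))

ids = ['MS233', 'MS772', 'MS120']
# ids.txt is a hypothetical name used only for this example
target = os.path.join(tempfile.mkdtemp(), 'ids.txt')
save_list(ids, file_name=target)  # keyword argument replaces the default

# Read the file back to confirm one item per line
with open(target) as fh:
    saved = fh.read().splitlines()
print(saved)
```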

Listing 6.5: getaverage.py: Function to calculate the average of values entered as parameters


In [28]:
def average(*numbers):
    if len(numbers)==0:
        return None
    else:
        total = sum(numbers)
        return total / len(numbers)

In [29]:
average(2,3,4,3,2)


Out[29]:
2.8

In [30]:
average(2,3,4,3,2,1,8,10)


Out[30]:
4.125
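
When the values are already stored in a list, the same function can be called by unpacking the list with * at the call site. A short sketch (the list name values is chosen here for illustration), which also shows the result of a call with no arguments:

```python
def average(*numbers):
    if len(numbers) == 0:
        return None
    return sum(numbers) / len(numbers)

values = [2, 3, 4, 3, 2]
from_list = average(*values)  # same as average(2, 3, 4, 3, 2)
empty = average()             # no arguments: the function returns None
print(from_list)  # 2.8
print(empty)      # None
```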

Listing 6.6: list2text2.py: Converts a list into a text file, using print and *


In [31]:
def save_list(input_list, file_name='temp.txt'):
    """A list (input_list) is saved to a file (file_name)"""
    with open(file_name, 'w') as fh:
        print(*input_list, sep='\n', file=fh)
    return None

Listing 6.7: list2text2.py: Function that accepts a variable number of arguments


In [32]:
def commandline(name, **parameters):
    line = ''
    for item in parameters:
        line += ' -{0} {1}'.format(item, parameters[item])
    return name + line

In [33]:
commandline('formatdb', t='Caseins', i='indata.fas')


Out[33]:
'formatdb -t Caseins -i indata.fas'

In [34]:
commandline('formatdb', t='Caseins', i='indata.fas', p='F')


Out[34]:
'formatdb -t Caseins -i indata.fas -p F'
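
Keyword arguments collected by **parameters can also come from an existing dictionary, unpacked with ** in the call. A minimal sketch, where the dictionary name options is an assumption made for this example:

```python
def commandline(name, **parameters):
    line = ''
    for item in parameters:
        line += ' -{0} {1}'.format(item, parameters[item])
    return name + line

# Dictionaries keep insertion order (Python 3.7+), so flags appear as listed
options = {'t': 'Caseins', 'i': 'indata.fas'}
cmd = commandline('formatdb', **options)
print(cmd)  # formatdb -t Caseins -i indata.fas
```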

Listing 6.8: allprimes.py: Function that returns all prime numbers up to a given value


In [35]:
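def is_prime(n):
    """Returns True if n is prime, False if not"""
    if n < 2:
        # 0, 1 and negative numbers are not prime
        return False
    for i in range(2,n-1):
        if n%i == 0:
            return False
    return True

def all_primes(n):
    """Returns a list with all prime numbers lower than n"""
    primes = []
    for number in range(1,n):
        if is_prime(number):
            primes.append(number)
    return primes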

Listing 6.9: allprimesg.py: Generator that replaces all_primes() in Listing 6.8


In [ ]:
def g_all_primes(n):
    for number in range(1,n):
        if is_prime(number):
            yield number
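
A generator produces its values on demand, so it can be consumed one item at a time with next() or all at once with list(). The sketch below includes its own is_prime(), with an explicit check that rejects numbers below 2, so that it runs independently of the cells above.

```python
def is_prime(n):
    """Returns True if n is prime, False if not"""
    if n < 2:
        return False
    for i in range(2, n):
        if n % i == 0:
            return False
    return True

def g_all_primes(n):
    for number in range(1, n):
        if is_prime(number):
            yield number

gen = g_all_primes(20)
first = next(gen)  # computes only as far as the first prime
rest = list(gen)   # consumes the remaining values
print(first)  # 2
print(rest)   # [3, 5, 7, 11, 13, 17, 19]
```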

Modules and Packages


In [ ]:
# utils.py file
def save_list(input_list, file_name='temp.txt'):
    """A list (input_list) is saved to a file (file_name)"""
    with open(file_name, 'w') as fh:
        print(*input_list, sep='\n', file=fh)
    return None

Since utils.py is not present in this shell, the following command retrieves the file from GitHub and stores it locally, so that Python can import it.


In [1]:
!curl https://raw.githubusercontent.com/Serulab/Py4Bio/master/code/ch6/utils.py -o utils.py


  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   225  100   225    0     0   2365      0 --:--:-- --:--:-- --:--:--  2368

In [2]:
import utils
utils.save_list([1,2,3])

In [4]:
!cat temp.txt


1
2
3
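
Besides import utils, a module can be imported under an alias, or a single function can be imported by name. The sketch below illustrates both forms without network access: it writes a small module to a temporary directory and imports it from there. The module name utils_demo, the alias ud and the file ids.txt are hypothetical, chosen so the example does not clash with the utils.py downloaded above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source code for a tiny module with the same function as utils.py
MODULE_SOURCE = '''
def save_list(input_list, file_name='temp.txt'):
    """A list (input_list) is saved to a file (file_name)"""
    with open(file_name, 'w') as fh:
        print(*input_list, sep='\\n', file=fh)
'''

workdir = Path(tempfile.mkdtemp())
(workdir / 'utils_demo.py').write_text(MODULE_SOURCE)
sys.path.insert(0, str(workdir))  # make the directory importable
importlib.invalidate_caches()

import utils_demo as ud            # whole module, under an alias
from utils_demo import save_list   # a single name from the module

out_file = workdir / 'ids.txt'
save_list(['MS233', 'MS772'], str(out_file))
content = out_file.read_text()
print(content)
print(ud.save_list is save_list)  # both names refer to the same function
```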